miR-129-5p as a biomarker for pathology and cognitive decline in Alzheimer’s disease

Background Alzheimer’s dementia (AD) pathogenesis involves complex mechanisms, including microRNA (miRNA) dysregulation. Integrative network and machine learning analysis of miRNA can provide insights into AD pathology and prognostic/diagnostic biomarkers. Methods We performed co-expression network analysis to identify network modules associated with AD, its neuropathology markers, and cognition using brain tissue miRNA profiles from the Religious Orders Study and Rush Memory and Aging Project (ROS/MAP) (N = 702) as a discovery dataset. We performed association analysis of hub miRNAs with AD, its neuropathology markers, and cognition. After selecting target genes of the hub miRNAs, we performed association analysis of the hub miRNAs with their target genes and then performed pathway-based enrichment analysis. For replication, we performed a consensus miRNA co-expression network analysis using the ROS/MAP dataset and an independent dataset (N = 16) from the Gene Expression Omnibus (GEO). Furthermore, we used a machine learning approach to assess the performance of hub miRNAs for AD classification. Results Network analysis identified a glucose metabolism pathway-enriched module (M3) as significantly associated with AD and cognition. Five hub miRNAs (miR-129-5p, miR-433, miR-1260, miR-200a, and miR-221) of M3 had significant associations with AD clinical and/or pathologic traits, with miR-129-5p by far the strongest across all phenotypes. Gene-set enrichment analysis of target genes associated with their corresponding hub miRNAs identified significantly enriched biological pathways including ErbB, AMPK, MAPK, and mTOR signaling pathways. Consensus network analysis identified two AD-associated consensus network modules and two hub miRNAs (miR-129-5p and miR-221). Machine learning analysis showed that the AD classification performance (area under the curve (AUC) = 0.807) of age, sex, and APOE ε4 carrier status was significantly improved by 6.3 percentage points (to 0.870) with inclusion of five AD-associated hub miRNAs. Conclusions Integrative network and machine learning analysis identified miRNA signatures, especially miR-129-5p, as associated with AD, its neuropathology markers, and cognition, enhancing our understanding of AD pathogenesis and leading to better performance of AD classification as potential diagnostic/prognostic biomarkers. Supplementary Information The online version contains supplementary material available at 10.1186/s13195-023-01366-8.


Background
Alzheimer's disease is a prevalent cause of dementia, accounting for approximately 60% to 80% of cases [1]. It is characterized by the extracellular deposition of amyloid-β (Aβ) in the form of diffuse and neuritic plaques (NPs) and the presence of intracellular neurofibrillary tangles (NFTs) comprised of aggregated hyperphosphorylated tau protein [2]. However, the exact mechanisms underlying the pathogenesis of Alzheimer's disease remain unclear due to the involvement of complex neurochemical and genetic factors [3]. Dysregulated expression of microRNAs (miRNAs) is a potential mechanism contributing to gene expression changes in Alzheimer's disease [4-6].
miRNAs are endogenous single-stranded RNA molecules approximately 20-23 nucleotides long [7]. They primarily repress the translation of specific messenger RNAs (mRNAs) by binding to their 3′-untranslated regions (3′-UTR) [8]. Several miRNAs have been identified as being dysregulated in Alzheimer's disease, with certain miRNAs being highly expressed in the brain [4,5,9,10]. However, there has been limited consensus among studies conducted on relatively small or modest sample sizes [11].
The construction of network modules based on the correlation of miRNA expression profiles can reveal the global properties of biological organization [12], given the assumption that miRNAs involved in similar functions tend to be co-expressed [13]. The weighted gene co-expression network analysis (WGCNA) approach is a method that focuses on gene co-expression networks and has been useful in describing the system-level correlation structure among transcripts [14]. Additionally, the network-based approach is a dimensionality reduction technique for analyzing high-dimensional omics data, providing insights into the pathogenesis of multifactorial disorders [15]. Therefore, the WGCNA approach has been utilized to enhance our understanding of the pathogenesis of Alzheimer's disease [16].
In this study, using miRNA expression profiles from a large longitudinal study of aging, the Religious Orders Study and Rush Memory and Aging Project (ROS/MAP) (N = 702), as a discovery sample, we performed differential expression analysis and co-expression network analysis to identify Alzheimer's dementia (AD)-associated network modules and their hub miRNAs. We also investigated their association with neuropathological markers [2] and cognition. After selecting target genes of the hub miRNAs, we performed association analysis of the hub miRNAs with their target genes and then differential expression analysis of target genes using brain tissue RNA-Seq data. For replication analysis, we performed a consensus miRNA co-expression network analysis to identify AD-associated consensus network modules and their hub miRNAs using the ROS/MAP dataset and an independent miRNA expression profile dataset (N = 16) from Gene Expression Omnibus (GEO). Finally, we employed a machine learning approach to assess the performance of hub miRNAs for the classification of AD.

Study samples
Two independent datasets were used: ROS/MAP and GEO. In the ROS/MAP cohort, subjects were categorized as having no cognitive impairment (NCI) or AD [17,18]. In this study, to achieve a more robust differentiation between AD and NCI by employing both clinical and neuropathology criteria, AD was defined by Braak NFT scores [19] ≧ 4, Consortium to Establish a Registry for Alzheimer's Disease (CERAD) scores [20] of definite Alzheimer's disease (frequent NPs) or probable Alzheimer's disease (moderate NPs), and cognitive diagnosis of probable Alzheimer's disease with no other causes. NCI was defined by Braak scores ≦ 3, CERAD scores of possible Alzheimer's disease (sparse NPs) or no Alzheimer's disease, and clinical diagnosis of no cognitive impairment [21]. In the GEO (GSE157239) dataset from the Human Brain Bank of the Brazilian Aging Brain Study Group, AD was defined by Braak NFT scores [19] ≧ 3 and NCI as subjects without neuropathological lesions or neurological signs, as previously described [22].
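The combined clinical and neuropathological ROS/MAP definition above can be sketched as a simple decision rule. The following Python fragment is only an illustration: the function name and the category labels ("definite", "probable", "possible", "none", "AD", "NCI") are hypothetical encodings chosen for this example, not the variable coding of the ROS/MAP data.

```python
from typing import Optional


def assign_group(braak: int, cerad: str, clinical_dx: str) -> Optional[str]:
    """Return 'AD', 'NCI', or None under the ROS/MAP definitions in the text.

    The label strings are hypothetical encodings used only in this sketch.
    """
    cerad_positive = cerad in {"definite", "probable"}  # frequent or moderate NPs
    cerad_negative = cerad in {"possible", "none"}      # sparse or no NPs
    if braak >= 4 and cerad_positive and clinical_dx == "AD":
        return "AD"
    if braak <= 3 and cerad_negative and clinical_dx == "NCI":
        return "NCI"
    # Subjects with discordant clinical and pathological findings fall in neither group.
    return None


concordant_ad = assign_group(5, "definite", "AD")
concordant_nci = assign_group(2, "none", "NCI")
discordant = assign_group(5, "probable", "NCI")
```

Because all three criteria must agree, subjects with discordant clinical and pathological findings are assigned to neither group, which is why only a subset of the 702 participants enters the AD-versus-NCI comparisons.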

miRNA profiling data
For the ROS/MAP cohort, miRNA profile data were downloaded from the Accelerating Medicines Partnership for Alzheimer's Disease (AMP-AD) Knowledge Portal on Synapse (syn3387325) (https://www.synapse.org). The miRNAs were extracted from bulk brain tissue in the dorsolateral prefrontal cortex (DLPFC) using the NanoString nCounter Human miRNA Expression assay kit and annotated using definitions from miRBase [23]. These miRNAs were eluted from the miRNeasy spin columns in buffer and tested by NanoDrop and Bioanalyzer RNA 6000 Nano Agilent chips. The dataset consisted of 309 miRNAs from 702 individuals after correcting for the probe-specific backgrounds and performing a three-step filtering of sample and miRNA expression [6,24]. The miRNA data were normalized using a combination of quantile normalization and ComBat [25] to remove batch effects. The miRNAs identified by microarray were validated with specific real-time reverse transcription PCR (qRT-PCR) assays, as previously described in detail [6,24].
For the GEO dataset, miRNA profiles were downloaded from the National Center for Biotechnology Information under accession number GSE157239 (http://www.ncbi.nlm.nih.gov/projects/geo/). The miRNAs were extracted from bulk brain tissue in the temporal cortex, profiled by microarray using the Affymetrix miRNA Array, and annotated using definitions from miRBase [23]. After isolation and biotin labeling of the miRNAs, the labeled miRNAs were hybridized on the GeneTitan instrument with the Array Strip Hybridization kit [26]. Quality control was performed using the Expression Console software [27], and results were exported for processing in the Transcriptome Analysis Console software [28]. The miRNAs identified by microarray were also validated via qRT-PCR, as previously described [22].

RNA-Sequencing for mRNA expression data in the ROS/MAP
RNA-Seq data generated from brain tissue in the DLPFC were downloaded from the AMP-AD Knowledge Portal on Synapse (syn8456638) (https://www.synapse.org). The sequencing was performed on the Illumina HiSeq2000 with 101 base pair paired-end reads, targeting a coverage of 50 million paired reads, as detailed in previous studies [24,29]. The reads were aligned to the human genome reference (hg19) using Tophat [30] with Bowtie1 as the aligner [31]. The expression levels of transcripts were estimated using the Gencode V14 annotation with the RSEM package [32]. FPKM (fragments per kilobase of transcript per million mapped reads) normalization was applied to the mRNA expression data. The log2 counts-per-million (logCPM) values generated in 634 subjects were finally used for further analysis, as previously described [33].

Assessment of CERAD, Braak, and cognition in ROS/MAP
Definite or probable Alzheimer's disease relative to possible or no Alzheimer's disease was based on CERAD, which is a semiquantitative measure of neuritic plaques, as previously reported [20,34,35]. Braak scores [19] ≧ 4 relative to ≦ 3 were used to dichotomize neurofibrillary tangles as previously reported [34,35]. We used the term "CERAD positive" for "probable" or "definite" Alzheimer's disease, and "CERAD negative" for "possible" or "no" Alzheimer's disease based on CERAD. We also used the term "Braak positive" for Braak scores ≧ 4, and "Braak negative" for Braak scores ≦ 3. Cognitive performance was assessed longitudinally by Z-scores of global cognitive performance averaged across 19 tests spanning five cognitive domains, including episodic memory, working memory, semantic memory, perceptual speed, and visuospatial ability/perceptual orientation, as previously reported [36].

Co-expression miRNA network construction
A scale-free miRNA co-expression network was constructed using the WGCNA package based on miRNA expression profiles [14]. The "pickSoftThreshold" function was used to select an appropriate soft threshold power β that achieves a scale-free topology, and the network adjacency matrix was calculated based on co-expression similarity. Unsupervised hierarchical clustering with the dynamic tree cut procedure was used to identify modules of co-expressed miRNAs. Each module was represented by a module eigengene (ME), which was defined as the first principal component of the expression matrix representing the overall level of miRNA expression within each module.
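To make the soft-thresholding step concrete, the sketch below computes a weighted adjacency as the absolute Pearson correlation raised to the power β. It is a plain-Python illustration rather than the WGCNA implementation: it assumes an unsigned network (the text does not state whether a signed or unsigned network was used), and the miRNA names and expression values are invented toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(profiles, beta=4):
    """Unsigned weighted adjacency a_ij = |cor(x_i, x_j)| ** beta (assumed form)."""
    names = list(profiles)
    return {
        (i, j): abs(pearson(profiles[i], profiles[j])) ** beta
        for i in names
        for j in names
        if i != j
    }


# Hypothetical toy expression profiles (four samples per miRNA).
toy_profiles = {
    "miR-a": [1.0, 2.0, 3.0, 4.0],
    "miR-b": [2.0, 4.0, 6.0, 8.5],
    "miR-c": [4.0, 1.0, 3.0, 2.0],
}
adjacency = soft_threshold_adjacency(toy_profiles, beta=4)
```

Raising the correlation to a power greater than 1 shrinks weak correlations toward zero much faster than strong ones, which is what pushes the network toward the scale-free topology targeted by the choice of β.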

Identification of hub miRNAs in an AD-associated network module and their target genes
The "softConnectivity" function was used to identify the top 10 hub miRNAs with the highest intramodular connectivity based on the topological overlap matrix (TOM). The target genes of these hub miRNAs were predicted using the TargetScan [37] and miRDB [38] databases, which provide both predicted and experimentally verified interaction information between miRNAs and genes. The overlapping target genes between the two databases were selected for further analysis.
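The two selection steps in this paragraph, ranking miRNAs by intramodular connectivity and keeping only targets predicted by both databases, can be sketched as follows. This is a simplified illustration, not the WGCNA code: connectivity is taken as the sum of a node's connection weights within the module, and all node names, weights, and gene symbols are hypothetical.

```python
def connectivity(adjacency):
    """Sum of connection weights per node; adjacency maps ordered pairs (i, j) to a weight."""
    k = {}
    for (i, _j), weight in adjacency.items():
        k[i] = k.get(i, 0.0) + weight
    return k


def top_hubs(adjacency, n=10):
    """Names of the n nodes with the highest connectivity, in descending order."""
    k = connectivity(adjacency)
    return sorted(k, key=k.get, reverse=True)[:n]


def overlapping_targets(targetscan_genes, mirdb_genes):
    """Genes predicted as targets by both databases."""
    return set(targetscan_genes) & set(mirdb_genes)


# Hypothetical symmetric weights for a three-node module.
toy_adjacency = {
    ("a", "b"): 0.9, ("b", "a"): 0.9,
    ("a", "c"): 0.5, ("c", "a"): 0.5,
    ("b", "c"): 0.1, ("c", "b"): 0.1,
}
hubs = top_hubs(toy_adjacency, n=2)
shared = overlapping_targets({"G1", "G2", "G3"}, {"G2", "G3", "G4"})
```

Requiring agreement between the two prediction databases trades sensitivity for specificity: fewer candidate targets are carried forward, but each one is supported by two independent prediction sources.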

Association of AD-associated hub miRNAs with their target genes using RNA-Seq data in the ROS/MAP
After target genes of AD-associated hub miRNAs were selected, brain tissue-based RNA-Seq data analyses were performed to assess whether the predicted target genes were associated with expression levels of their corresponding miRNAs.

Pathway-based enrichment analysis of target genes and miRNAs
Gene-set enrichment analysis was performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) online tool [39] to identify the biological pathways of target genes significantly associated with the expression levels of AD-associated hub miRNAs. The Kyoto Encyclopedia of Genes and Genomes (KEGG) [40] and Gene Ontology (GO) biological processes (BP) [41] pathways were used.
miRNA-set enrichment analysis was also performed using the TAM 2.0 [42] tool to characterize the functional annotations of the miRNA set within each module through overrepresentation analysis. miRNA-set enrichment analysis was restricted to pathways containing 3 or more miRNAs. The most strongly associated biological processes for the involvement of the central nervous system were selected for each module. Bonferroni correction [43] was applied to adjust for multiple testing.

Consensus network construction
A consensus miRNA co-expression network was constructed across the ROS/MAP and GEO (GSE157239) datasets for replication using the WGCNA package [14]. The "blockwiseConsensusModules" function was used to construct modules in the consensus network based on pairwise miRNA dissimilarity measures. The importance of a miRNA in a network module was determined by kME, defined as the strength of the correlation of expression levels of a miRNA with the ME. Hub miRNAs in a consensus module were defined as those miRNAs with an absolute kME value greater than 0.7.
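The kME-based hub definition can be illustrated with a short sketch: kME is the Pearson correlation between a miRNA's expression profile and the module eigengene, and a miRNA is called a hub when |kME| exceeds 0.7. The eigengene and expression values below are invented toy data, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def hub_mirnas(expression, eigengene, threshold=0.7):
    """Return {miRNA: kME} for miRNAs whose |kME| exceeds the threshold."""
    kme = {name: pearson(values, eigengene) for name, values in expression.items()}
    return {name: value for name, value in kme.items() if abs(value) > threshold}


# Hypothetical module eigengene and expression profiles across four samples.
toy_eigengene = [-1.5, -0.5, 0.5, 1.5]
toy_expression = {
    "miR-x": [1.0, 2.0, 3.0, 4.2],  # tracks the eigengene closely
    "miR-y": [3.0, 1.0, 4.0, 2.0],  # uncorrelated with the eigengene
}
hubs = hub_mirnas(toy_expression, toy_eigengene)
```

Using the absolute value means that a miRNA strongly anti-correlated with the eigengene is also treated as a hub.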
The "modulePreservation" function was used to construct a preservation network based on the correlation between all pairs of consensus ME values across the two networks from the ROS/MAP and GEO datasets. A density (D) of the eigengene network, defined as the average scaled connectivity, was estimated to discover changes in preservation patterns across each consensus module. A D value close to 1 indicates strong preservation of the correlation patterns between all pairs of eigengenes across the two networks [44,45].
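One common way to formalize this density is sketched below. It assumes the eigengene-network formulation in which the adjacency between two eigengenes is (1 + cor)/2, so that the preservation of a pair across two datasets is 1 - |cor1 - cor2|/2 and D is the mean preservation over all pairs. Whether this is exactly the quantity computed in the analysis is an assumption of this example, and the correlation matrices are invented.

```python
def preservation_density(cor1, cor2):
    """Mean pairwise preservation between two eigengene correlation matrices.

    Assumes adjacency = (1 + cor) / 2, so the preservation of a pair is
    1 - |cor1 - cor2| / 2; the density D averages this over all pairs i < j.
    """
    n = len(cor1)
    total = 0.0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += 1 - abs(cor1[i][j] - cor2[i][j]) / 2
            pairs += 1
    return total / pairs


# Hypothetical eigengene correlation matrices for three modules in two datasets.
toy_cor_discovery = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, -0.4],
    [0.2, -0.4, 1.0],
]
toy_cor_replication = [
    [1.0, 0.6, 0.4],
    [0.6, 1.0, -0.4],
    [0.4, -0.4, 1.0],
]
density = preservation_density(toy_cor_discovery, toy_cor_replication)
```

Under this definition D equals 1 only when every pairwise eigengene correlation is identical in the two datasets, and it decreases as the correlation structures diverge.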

Statistical analysis
For differential expression analysis, logistic regression models were used to investigate the association of miRNA expression levels with AD, CERAD, and Braak stage. For co-expression miRNA network analysis, logistic regression models were used to evaluate the association of ME values of network modules with AD and to identify AD-associated network modules. Linear regression models were used to perform the association analysis of miRNA expression levels and ME values of network modules with global cognitive performance at the last visit, while linear mixed effects models were used to investigate their association with longitudinal changes of global cognitive performance.
For differential expression analysis using RNA-Seq data of the target genes identified as significantly associated with expression levels of AD-associated hub miRNAs, logistic regression models were used to investigate the association between expression levels of the target genes and AD.
Covariates in the association analysis with AD and neuropathological markers included age, sex, study (ROS or MAP), RNA integrity numbers, and post-mortem interval. For the association analysis with cognitive performance, education was included along with the aforementioned covariates. The false discovery rate (FDR) correction was used to adjust for multiple testing with the Benjamini-Hochberg procedure [46] unless otherwise specified.
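For readers unfamiliar with the Benjamini-Hochberg procedure, the following sketch shows how FDR-adjusted p values are obtained from a list of raw p values. It is a generic textbook implementation written for illustration; the input p values are invented and are not results from this study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity of the adjustment.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four tests.
toy_pvalues = [0.01, 0.04, 0.03, 0.20]
toy_adjusted = benjamini_hochberg(toy_pvalues)
```

In this toy example the second and third tests receive the same adjusted value, because the adjustment is forced to be non-decreasing in the raw p value.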
The STREAMLINE tool [47], a machine learning approach using penalized logistic regression, was used to investigate the classification performance of AD-associated hub miRNAs in differentiating AD from NCI. This approach was chosen to reduce the effect of multicollinearity on feature selection [48]. The data were randomly divided into 70% used for training the model and 30% used for testing, with a 10-fold cross-validation procedure. The classification performance of five different models was evaluated using the receiver operating characteristic (ROC) curve and the area under the receiver operating characteristic curve (AUC) [49].
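The AUC used as the performance metric has a simple rank-based interpretation: it is the probability that a randomly chosen AD case receives a higher predicted score than a randomly chosen NCI subject, with ties counted as one half. The sketch below computes it directly from that definition; it is independent of the STREAMLINE tool, and the labels and scores are invented.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = AD, 0 = NCI) and predicted probabilities.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.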
All statistical analyses were conducted using R version 4.2.0, and statistical significance was set at a P-value of 0.05 after adjusting for multiple comparisons. Figure 1 shows a workflow of all analysis steps used in this study.

Results
The ROS/MAP cohort with miRNA data consisted of 702 participants, including 102 NCI and 177 AD, with a median age of 88.5 years at death and 64.1% females. The GEO (GSE157239) dataset with miRNA data consisted of 16 participants, including 8 NCI and 8 AD, with a median age of 81.5 years at death and 68.8% females. Table 1 and Table S1 show the demographic characteristics of these individuals.

Modules associated with AD, CERAD, Braak, and cognition
A scale-free co-expression network was constructed based on the miRNA expression profiles of 702 subjects using WGCNA. A cutting tree height of 11 was selected to eliminate outliers, and 663 individuals under the cutting tree height were kept for further analysis. A soft thresholding power value of β = 4 was selected, and four network modules were identified (Figure S1). The M0 module consisted of miRNAs not assigned to any other modules and was excluded in further analysis. Lower ME values of the M3 module were significantly associated with a greater likelihood of AD and CERAD, but not with Braak stage. Moreover, lower ME values of the M3 module were associated with lower global cognitive performance at the last visit and faster longitudinal decline of global cognition (Fig. 3 and Table S3). Enrichment analysis of miRNAs revealed that M3 was highly associated with glucose metabolism, and M1 and M2 were strongly linked to innate immunity and embryonic development, respectively (Table S4).

Hub miRNAs associated with AD, CERAD, Braak, and cognition
In the AD-associated glucose metabolism pathway-enriched M3 module, the top 10 miRNAs with the highest TOM-based intramodular connectivity were identified as hub miRNAs (Fig. 4). Among these 10 miRNAs, the strongest finding was for miR-129-5p, which was inversely related to AD, CERAD, and Braak and directly related to level of and change in cognition.
Higher expression levels of miR-433 and miR-221 were inversely associated with the likelihood of AD, while only miR-433 was associated with better cognition and slower cognitive decline (Fig. 4 and Table S5). By contrast, higher expression levels of miR-200a and miR-1260 were related to a greater likelihood of AD, higher CERAD, and lower cognition, but not to Braak or cognitive decline. Finally, miR-744 was only associated with lower global cognition.

Pathway-based enrichment analysis of target genes
The KEGG pathway-based enrichment analysis revealed that the target genes identified through RNA-Seq analysis for the five AD-associated hub miRNAs were mainly involved in the following pathways: axon guidance, erythroblastic leukemia viral oncogene homolog (ErbB) signaling pathway, mitogen-activated protein kinase (MAPK) signaling pathway, gamma-aminobutyric acid-ergic (GABAergic) synapse, autophagy, 5′ adenosine monophosphate-activated protein kinase (AMPK) signaling pathway, mammalian target of rapamycin (mTOR) signaling pathway, and glutamatergic synapse (Table 3). The GO-BP pathway-based enrichment analysis showed that target genes identified through RNA-Seq analysis for the five AD-associated hub miRNAs were enriched in the following pathways: protein phosphorylation, nervous system development, chromatin organization, and neuron migration (Table S6).

Consensus network analysis using miRNA profiles from two independent datasets
Consensus modules and a preservation network were constructed to understand changes in preservation patterns across consensus modules. Additionally, hub miRNAs in consensus modules were selected to assess the replication of the five AD-associated hub miRNAs, identified in the ROS/MAP dataset, within an independent dataset.

Identification of consensus network modules
The construction of miRNA co-expression network modules was performed separately for the discovery and replication datasets, and the consensus modules were identified using the consensus dissimilarity measures in the average linkage hierarchical clustering method. A soft thresholding power value of β = 4 was selected for each dataset, and four consensus modules were identified (Figure S2). All consensus modules had counterparts in both datasets, indicating that the consensus module structures in the two datasets were similar (Figure S3). The consensus CM0 module consisted of miRNAs not assigned to any other modules.

Preservation of consensus modules
A consensus ME network was constructed to investigate whether expression patterns of modules were correlated with each other (Figure 5A, B). The preservation networks of the correlations of the consensus ME pairs between the discovery and replication datasets were further constructed to understand the changes in preservation patterns of the two datasets (Figure 5C). The D value of the preservation networks between all pairs of the consensus ME across the two networks was 0.88 (Figure 5D), indicating that these modules were well preserved in their expression patterns across the two independent datasets.

Identification of AD-associated consensus modules across two independent datasets and replication for AD-associated hub miRNAs
Figure 6 shows the module-clinical trait heatmaps, indicating the associations between AD and four consensus modules across two independent datasets. Lower ME values of the consensus CM2 and CM3 modules were significantly associated with AD in the ROS/MAP dataset. In the GEO dataset, although the association between ME values of the consensus CM2 and CM3 modules and AD was not significant, the effect sizes and association directions were consistent with those in the ROS/MAP dataset. Among the five AD-associated hub miRNAs identified in the discovery dataset (ROS/MAP), miR-129-5p, miR-221, and miR-200a were included in the CM2 module (Table S7), but none of the miRNAs were present in the CM3 module (Table S8). Notably, miR-129-5p and miR-221, identified as AD-associated hub miRNAs in the ROS/MAP cohort, were also hub miRNAs in an independent replication dataset (GEO) because their kME values in the replication dataset (GEO) were higher than 0.7 (Table 4 and Table S8). The correlation analysis showed that the M3 module from the ROS/MAP dataset and the consensus CM2 module from the combined ROS/MAP and GEO datasets were strongly correlated (correlation coefficient = 0.88) (Figure S4).
Enrichment analysis of miRNAs identified glucose metabolism as a significantly enriched biological pathway in both M3 and CM2 (Table S9).

Machine learning analysis for AD classification
A machine learning approach using penalized logistic regression for the classification of AD from NCI was used to evaluate five different classification models (Table 5).
The results of 10-fold cross-validation are presented in Figure 7. Model 1, including age, sex, and APOE ε4 carrier status, achieved a mean AUC value of 0.807 with a standard deviation of 0.103 (Figure 7A). The mean AUC value of Model 3, obtained by adding five AD-associated hub miRNAs to Model 1, significantly increased to 0.870 with a standard deviation of 0.061 (P value = 0.022) (Figure 7C), which was comparable to that of Model 5, obtained by adding all 309 miRNAs to Model 1 (Figure 7E). The mean AUC values of Model 2, including only five AD-associated hub miRNAs, and Model 4, including all 309 miRNAs, were 0.740 and 0.815 with standard deviations of 0.068 and 0.052, respectively (Figure 7B, D).
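The comparison between Model 1 and Model 3 rests on two quantities: the absolute gain in mean AUC and a paired test of fold-wise AUC values. The sketch below computes the gain from the reported means and shows how a paired t statistic is formed from per-fold differences. The fold-wise AUC values are hypothetical (the per-fold results are not given in the text), so the resulting statistic is illustrative only.

```python
import math
import statistics

# Absolute gain in mean AUC, using the means reported in the text.
reported_gain = 0.870 - 0.807  # about 0.063, i.e. 6.3 percentage points


def paired_t_statistic(baseline, augmented):
    """Paired t statistic for fold-wise differences (augmented minus baseline)."""
    diffs = [b - a for a, b in zip(baseline, augmented)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


# Hypothetical per-fold AUC values for two models evaluated on the same folds.
toy_model1_auc = [0.80, 0.82, 0.78, 0.85]
toy_model3_auc = [0.86, 0.87, 0.85, 0.90]
toy_t = paired_t_statistic(toy_model1_auc, toy_model3_auc)
```

Pairing by fold matters because both models are evaluated on the same data partitions; the test therefore uses the variability of the per-fold differences rather than the larger fold-to-fold variability of each model's AUC.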

Discussion
In this study, we performed a network-based analysis of miRNAs and identified a network module (glucose metabolism pathway-enriched M3) that showed significant associations with AD, level of and change in cognition, and CERAD and Braak pathologic traits of Alzheimer's disease. In the AD-associated glucose metabolism pathway-enriched M3 module, we identified five hub miRNAs (miR-129-5p, miR-433, miR-1260, miR-200a, and miR-221) as significantly associated with AD, neuropathologic markers, and/or cognition, with miR-129-5p being the strongest and associated with all AD traits. Gene-set enrichment analysis of the target genes of these five AD-associated hub miRNAs revealed enrichment of ErbB, MAPK, AMPK, and mTOR signaling pathways. In the replication analysis using an independent additional dataset, we identified AD-associated CM2 and CM3 modules. Remarkably, miR-129-5p and miR-221, identified as AD-associated hub miRNAs in the discovery cohort, were also found to be hub miRNAs in the replication cohort. Preservation analysis showed consistent expression patterns of the consensus modules across two independent datasets. This study identified AD-associated miRNAs through network-based analysis that were not detected in the miRNA-based differential expression analysis. Moreover, our findings shed light on the association between AD-associated miRNAs and longitudinal changes of cognition.
The machine learning analysis for classification of AD from NCI showed that five AD-associated miRNAs significantly improved the performance of demographic information and APOE ε4 carrier status for classification of AD from a mean AUC value of 0.807 to that of 0.870.
miRNAs have emerged as promising therapeutic targets in Alzheimer's disease due to their crucial role in regulating the expression levels of target genes involved in Alzheimer's disease pathogenesis [10,83,84]. Currently, two types of miRNA-based therapeutics, miRNA mimics and anti-miRNAs, are being explored for therapeutic interventions in Alzheimer's disease [10,77].
Our study has several limitations. Firstly, the limited sample size of the replication cohort might have contributed to the decreased statistical power for replication analysis. Further studies with larger samples are needed to validate our findings. Secondly, the replication dataset lacks neuropathological markers and cognitive information, limiting our investigation of associations with neuropathology and cognition to the ROS/MAP cohort. Thirdly, miRNA expression profiles were generated using different brain regions and different microarray platforms in the discovery and replication cohorts, which may have introduced variability in the results. Lastly, the difference in the definitions of the diagnostic groups (NCI and AD) in the two datasets may have introduced a potential confounding factor in the consensus network analysis.

Conclusions
In summary, our network-based approach identified AD pathology and cognition-associated miRNAs. Notably, miR-129-5p and miR-221 were replicated in an independent dataset. The inclusion of AD-associated miRNAs improved the classification performance of AD from NCI. This integrative network approach can provide insight into AD pathogenesis and highlights these miRNAs as diagnostic/prognostic biomarkers and potential therapeutic targets for AD. However, further investigations are necessary to elucidate the underlying mechanisms and validate these findings.

Fig. 1 Schematic overview of the workflow of our analysis. AD Alzheimer's dementia, MAP Memory and Aging Project, miRNAs microRNAs, mRNAs messenger RNAs, NCI no cognitive impairment, ROS Religious Orders Study, 3′-UTR 3′-untranslated region

Fig. 2 Heatmap of association analysis results of miRNAs with diagnosis and clinical and pathological traits. The numbers represent the logistic and linear regression coefficients of association between miRNAs and traits. Stars indicate significant associations with FDR-corrected p value < 0.05. Red colors represent positive correlations, while blue colors denote negative correlations. The darker the color, the stronger the association. AD Alzheimer's dementia, CERAD Consortium to Establish a Registry for Alzheimer's Disease, FDR false discovery rate, miRNAs microRNAs. Note: significance stars indicate the p values of the correlations adjusted for multiple comparisons. *P-value < 0.05. **P-value < 0.01. ***P-value < 0.001

Fig. 3 Association analysis results of module eigengenes with diagnosis and clinical and pathological traits. A The numbers represent the logistic and linear regression coefficients of association between module eigengenes and traits. Stars indicate significant associations with FDR-corrected p value < 0.05. Red colors represent positive correlations, while blue colors denote negative correlations. The darker the color, the stronger the association. B Violin and box plots represent the M3 module eigengene values in the diagnosis, amyloidopathy, and tauopathy groups from the ROS/MAP cohort. C The x-axis represents years before the last clinical evaluation, while the y-axis indicates global cognitive Z scores. The two lines represent the different slopes of global cognitive changes of subjects in the 1st quartile (green color) versus the 4th quartile (red color) stratified by ME values of the AD-associated M3 module. AD Alzheimer's dementia, CERAD Consortium to Establish a Registry for Alzheimer's Disease, FDR false discovery rate, MAP Memory and Aging Project, ME module eigengene, NCI no cognitive impairment, ROS Religious Orders Study. Note: significance stars indicate the p values of the correlations adjusted for multiple comparisons. *P-value < 0.05. **P-value < 0.01

Fig. 4 Association analysis results of top 10 candidate hub miRNAs with diagnosis and clinical and pathological traits. A The numbers represent the logistic and linear regression coefficients of association between miRNAs and traits. Stars indicate significant associations with FDR-corrected p value < 0.05. Red colors represent positive correlations, while blue colors denote negative correlations. The darker the color, the stronger the association. B, C The x-axis represents years before the last clinical evaluation, while the y-axis indicates global cognitive Z scores. The two lines represent the different slopes of global cognitive changes of subjects in the 1st quartile (green color) versus the 4th quartile (red color) stratified by expression levels of miR-129-5p (B) and miR-433 (C), respectively. AD Alzheimer's dementia, CERAD Consortium to Establish a Registry for Alzheimer's Disease, FDR false discovery rate, miRNAs microRNAs. Note: significance stars indicate the p values of the correlations adjusted for multiple comparisons. *P-value < 0.05. **P-value < 0.01. ***P-value < 0.001

Fig. 5 Construction of consensus module eigengene networks. A, B Heatmaps showing the adjacencies of eigengenes for each of the eigengene networks (A, ROS/MAP cohort; B, GSE157239 cohort). Each row and column corresponds to an eigengene labeled by the consensus module or diagnosis. Red colors represent positive correlations, while blue colors denote negative correlations. The darker the color, the stronger the association. C Adjacency heatmap displaying the pairwise preservation between the two consensus eigengene networks. Each row and column corresponds to an eigengene labeled by the consensus module or diagnosis. Red colors represent adjacency. The darker the color, the higher the adjacency. D Bar plot showing the preservation of correlation of consensus module eigengenes between the two networks. Each colored bar corresponds to the eigengene of the associated consensus module. The height of the bar (y-axis) represents the eigengene preservation measure. The D value indicates the mean preservation of eigengene networks across the datasets. The high density value of D (= 0.88) denotes high overall preservation between the two networks. AD Alzheimer's dementia, CM consensus module, M module, MAP Memory and Aging Project, ROS Religious Orders Study

Fig. 6 Association analysis results of consensus module eigengenes with diagnosis in the ROS/MAP cohort, GEO (GSE157239) cohort, and across ROS/MAP and GEO datasets. A, B Heatmaps showing module-trait relationships in the ROS/MAP (A) and GEO (GSE157239) (B) cohorts. C Heatmaps showing consensus module-trait relationships across the ROS/MAP and GSE157239 cohorts. The numbers in the table indicate the coefficient (top) and its p value (bottom) of association between consensus module eigengenes and diagnosis. Red colors represent positive correlations, while green colors denote negative correlations. The darker the color, the stronger the association. A missing (NA) entry indicates the failure to form a consensus because the directions of correlations in the ROS/MAP and GSE157239 datasets are opposite. AD Alzheimer's dementia, CM consensus module, GEO Gene Expression Omnibus, M module, MAP Memory and Aging Project, NA not applicable, ROS Religious Orders Study

Fig. 7 The ROC curves and mean AUC of machine learning approach using penalized logistic regression. Sensitivity is on the y-axis and 1-specificity is on the x-axis. 10-fold cross validation was used to investigate and compare the classification performance of five different classification models for differentiating AD from NCI. The gray zone around the mean ROC curve represents ± 1 standard deviation. The five classification models differ in their training features. A Training features include age, sex, and apolipoprotein E (APOE) ε4 carrier status. B Training features include five AD-associated hub miRNAs. C Training features include age, sex, APOE ε4 carrier status, and five AD-associated hub miRNAs. D Training features include all 309 miRNAs. E Training features include age, sex, APOE ε4 carrier status, and all 309 miRNAs. AD Alzheimer's dementia, AUC area under the curve, miRNAs microRNAs, NCI no cognitive impairment, ROC receiver operating characteristic
Training features in Model 1 included age, sex, and apolipoprotein E (APOE) ε4 carrier status; training features in Model 2 included five AD-associated hub miRNAs; training features in Model 3 included age, sex, APOE ε4 carrier status, and five AD-associated hub miRNAs; training features in Model 4 included all 309 miRNAs; and training features in Model 5 included age, sex, APOE ε4 carrier status, and all 309 miRNAs. Paired t-tests were performed to compare the AUC results for the different models.
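The comparison of AUC results between models can be sketched as follows. This is a minimal illustration, assuming that the paired t-test is applied to fold-wise AUC values from the same cross-validation splits; the text does not say exactly how the AUCs were paired, so that is an assumption. The function names and the fold-wise AUC values are hypothetical placeholders, not results from the study.

```python
from math import sqrt
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Uses the rank (Mann-Whitney) formulation; ties count as 0.5.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# AUC on a tiny toy fold: 1 = AD, 0 = NCI (hypothetical predicted scores).
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])

# Hypothetical fold-wise AUCs for two models evaluated on the same folds.
auc_baseline = [0.80, 0.78, 0.83, 0.79, 0.81]
auc_with_mirna = [0.85, 0.84, 0.86, 0.85, 0.88]

t_stat, df = paired_t(auc_with_mirna, auc_baseline)
print(f"toy AUC = {toy_auc}, t = {t_stat:.2f}, df = {df}")
```

The p value follows from comparing the t statistic with a t distribution on the returned degrees of freedom. A statistics library routine such as `scipy.stats.ttest_rel` would return it directly.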

Table 1
Demographic information of participants from the ROS/MAP cohort. Values are n (%), unless indicated otherwise. Among a total of 702 participants with miRNA data, 177 subjects met criteria for AD and 102 met criteria for NCI based on neuropathological and clinical data. AD Alzheimer's dementia, MAP Memory and Aging Project, miRNAs microRNAs, NCI no cognitive impairment, ROS Religious Orders Study. a Data are presented as median (interquartile range). b The Mann-Whitney U test or chi-square test was used to determine the P value for comparisons between AD and NCI groups, as appropriate

Table 2
Results of differential expression analysis of AD-associated target genes of five AD-associated hub miRNAs between NCI and AD. Only target genes that were significantly differentially expressed in AD are shown. AD Alzheimer's dementia, FDR false discovery rate, miRNAs microRNAs, NCI no cognitive impairment. a Adjusted p value using FDR

dataset, and four consensus modules were identified (Figure

Table 4
List of miRNAs with kME > 0.7 for the CM2 module in the replication dataset. kME and p values represent the correlation and significance levels, respectively, between the miRNA expression levels and the module eigengene of the consensus CM2 module. The miRNAs with kME > 0.7 in the replication dataset (GEO) are listed for the CM2 module. CM consensus module, FDR false discovery rate, GEO Gene Expression Omnibus, ME module eigengene, miRNAs microRNAs. a Adjusted p value using FDR
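The kME filter described in this caption can be sketched as below. The sketch assumes kME is the Pearson correlation between a miRNA's expression profile and the module eigengene, which matches the caption's description, and that the eigengene has already been computed. The miRNA names and all values are hypothetical placeholders, and the p values and FDR adjustment reported in the table are not computed here.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def hub_mirnas(expression, eigengene, threshold=0.7):
    """Return miRNAs whose kME (correlation with the eigengene) exceeds threshold."""
    kme = {name: pearson(values, eigengene) for name, values in expression.items()}
    return {name: r for name, r in kme.items() if r > threshold}


# Hypothetical module eigengene across five samples.
module_eigengene = [-0.5, -0.2, 0.0, 0.3, 0.4]

# Hypothetical expression profiles for three placeholder miRNAs.
expression = {
    "miR-a": [1.0, 1.4, 1.9, 2.6, 2.9],  # tracks the eigengene closely
    "miR-b": [2.0, 1.0, 2.5, 1.2, 2.1],  # essentially unrelated
    "miR-c": [3.0, 2.5, 2.2, 1.6, 1.5],  # anti-correlated
}

hubs = hub_mirnas(expression, module_eigengene)
for name, r in hubs.items():
    print(f"{name}: kME = {r:.2f}")
```

Because the filter uses the signed correlation, a strongly anti-correlated miRNA such as the placeholder `miR-c` is excluded.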

Table 5
Mean AUC and standard deviation of machine learning models using penalized logistic regression. 10-fold cross validation was used to investigate and compare the classification performance of five different machine learning models for differentiating AD from NCI. The machine learning model, mean AUC, and standard deviation of the AUC are presented. AD Alzheimer's dementia, APOE apolipoprotein E, AUC area under the curve, miRNAs microRNAs, NCI no cognitive impairment